🖥 PDF Craft: a Python library for converting PDFs (primarily scanned books) to Markdown and EPUB, using local AI models and an LLM to structure the content
Github

Key features

- Text and layout extraction
Uses a combination of DocLayout-YOLO and its own algorithms to detect and filter headings, columns, footnotes, and page numbers

- Local OCR
Recognizes page text with OnnxOCR and supports GPU acceleration (CUDA)

- Reading order detection
Uses LayoutReader to arrange the text in the order a person would read it

- Conversion to Markdown
Generates a .md file with relative links to images (illustrations, tables, formulas) in the assets folder
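To illustrate what "relative links to images" means in the generated Markdown, here is a minimal sketch using only the Python standard library. It is not PDF Craft's own code: the function name relative_image_link and the example paths are hypothetical, chosen for illustration.

```python
import os
from pathlib import Path


def relative_image_link(markdown_file: str, image_file: str, alt_text: str = "") -> str:
    """Return a Markdown image tag whose path is relative to the .md file.

    Hypothetical helper for illustration; not part of PDF Craft.
    """
    md_dir = Path(markdown_file).parent
    # Compute the path from the Markdown file's folder to the image
    rel = os.path.relpath(image_file, start=md_dir)
    # Markdown links conventionally use forward slashes, even on Windows
    rel = Path(rel).as_posix()
    return f"![{alt_text}]({rel})"


print(relative_image_link("out/book.md", "out/assets/figure_1.png", "Figure 1"))
```

Because the link is relative, the .md file and its assets folder can be moved together without breaking the images.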

Installation and requirements
Python ≥ 3.10 (3.10.16 recommended).

Install with pip install pdf-craft and pip install onnxruntime==1.21.0 (or onnxruntime-gpu==1.21.0 for CUDA).

For the EPUB pipeline, you need access to an LLM service (for example, DeepSeek).
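The requirements above can be checked before installing. The sketch below uses only the standard library: it verifies the Python version and builds the pip requirement list for the CPU or CUDA case. The helper names (python_is_supported, pip_requirements) are hypothetical; the package names and the pinned version come from the post.

```python
import sys

MIN_PYTHON = (3, 10)            # minimum version stated in the post
ONNXRUNTIME_VERSION = "1.21.0"  # pinned version stated in the post


def python_is_supported(version_info=sys.version_info) -> bool:
    """True if the interpreter is at least Python 3.10."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def pip_requirements(use_cuda: bool) -> list[str]:
    """Requirement strings for a CPU-only or CUDA setup."""
    runtime = "onnxruntime-gpu" if use_cuda else "onnxruntime"
    return ["pdf-craft", f"{runtime}=={ONNXRUNTIME_VERSION}"]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.10 or newer is required")
    else:
        # Print the command rather than running it, so nothing is installed implicitly
        print("pip install " + " ".join(pip_requirements(use_cuda=False)))
```

Pass use_cuda=True to get the onnxruntime-gpu requirement instead; the two runtime packages are alternatives, so only one of them should be installed.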

🟡 Github


#پایتون #Python #library

🆔 @Python4all_pro



tg-me.com/Python4all_pro/1585

BY پایتون ( Machine Learning | Data Science )



